SYSTEM AND METHOD FOR MODEL MINING COMPLEX INFORMATION TECHNOLOGY SYSTEMS

BACKGROUND OF THE INVENTION

1. Technical Field of the Invention

The present invention relates to complex information technology (IT) systems and, in particular, to a system and method for discovering relations between components in a complex IT system, and more particularly, to techniques for iteratively determining IT system component associations.

2. Background and Object of the Invention

With the exponential growth of the computer and the computer industry, information technology (IT) systems have become increasingly complex and difficult to manage. A typical IT system in even a small company may contain dozens of computers, printers, servers, databases, etc., each component in some way connected to the others across the interlinkage. A simplified example of an interconnected IT system is shown in FIG. 1, described in more detail hereinafter.

Although interconnected systems, such as the one shown in FIG. 1, offer many advantages to the users, e.g., resource sharing, as such systems grow and the number of component interlinkages increases, the behavior of these complex systems becomes more difficult to predict. Further, system performance begins to lag or becomes inconsistent, even becoming chaotic in nature. The addition or removal of one component, even a seemingly minor one, could have dramatic consequences on the performance of the whole system. Even an upgrade on one component could adversely affect a distant, seemingly unrelated component. The system and method of the present invention is directed to techniques to better predict the behavior of complex IT systems, offering system administrators the opportunity to identify problem areas such as performance bottlenecks and to correct them prior to a system or component failure.

Conventional approaches to system performance monitoring are inadequate to easily divine the nature of a performance problem in a complex IT system since any data collected in monitoring is generally useless in ascertaining the true nature of the performance difficulty. The system and method of the present invention, however, provide a mechanism whereby system monitoring data is made easily accessible and usable for analyzing current performance and predicting future performance. The present invention facilitates this analysis through use of data mining principles discussed further hereinafter.

In general, data mining is an analysis of data in a database using tools which determine trends or patterns of event occurrences without knowledge of the meaning of the analyzed data. Such analysis may reveal strategic information that is hidden in vast amounts of data stored in a database. Typically, data mining is used when the quantity of information being analyzed is very large, when variables of interest are influenced by complicated relations to other variables, when the importance of a variable varies with its own value, or when the importance of variables varies with respect to time. In situations such as these, traditional statistical analysis techniques or common database management systems may fail or become unduly cumbersome.

Every year, companies compile large volumes of information in databases, further straining the capabilities of traditional data analysis techniques. These increasingly growing databases contain valuable information on many facets of the companies' business operations, including trend information which may only be gleaned by a critical analysis of key data interspersed across the database(s). Unfortunately, because of the sheer volume and/or complexity of the available information, such trend information is typically lost as it is unrecoverable by manual interpretation methods or traditional information management systems. The principles of data mining, however, may be employed as a tool to discover this hidden trend information buried within the pile of total information available.

Such data mining techniques are being increasingly utilized in a number of fields, including banking, marketing, biomedical applications and a number of other industries. Insurance companies and banks have used data mining for risk analysis, for example, using data mining methods in investigating their own claims databases for relations between client characteristics and corresponding claims. Insurance companies have obvious interest in the characteristics of their policy holders, particularly those exhibiting risky or otherwise inappropriate activities or behaviors adverse to the companies' interests, and with such analyses, are able to determine risk-profiles and adjust premiums commensurate with the determined risk.

Data mining has also found great success in direct marketing strategies. Direct marketing firms are able to determine relationships between personal attributes, such as [illegible] and locality, and the likelihood that a person responds to, for instance, a particular direct mailing. These relationships may then be used to direct mailings towards persons with the greatest probability of responding, thus enhancing the companies' prospects and potential profits. In utilizing data mining techniques, the company mails X number of direct marketing sales proposals. Out of these mailings, Y people respond. Data mining techniques are then applied to a database containing biographical information on all persons to whom the proposals were directed. Relational factors between those who did respond may be determined. The result is a subgroup of the original [illegible] that have demonstrated a greater probability of responding. This subgroup could be, for example, middle-aged, dual-income families with one child. Future mailings could be directed towards families fitting this biographical data. Responses from these familial groups could then be further data mined in relation to the original group to refine the analysis. A process such as this could be repeated indefinitely, where changes in behaviors of targeted groups would be recovered over time through increased amounts of data that is analyzed and with repeated analysis. In this sense, the data mining analysis 'learns' from each repeated result. In this example, data mining is used to predict the behavior of customers based on historical analysis of their behavior.

In the same manner demonstrated hereinabove, data mining may also be employed in predicting the behavior of the components of a complex information technology (IT) system. Similar approaches with appropriate modifications can be used to determine how interconnected components influence each other and for uncovering complex relations that exist throughout the IT system.

As discussed, multiple applications will be operated within a common IT infrastructure, such as the one shown in FIG. 1. Often, these applications will utilize some of the same resources. It is obvious that sharing of IT infrastructure resources among different applications may cause unexpected interactions or system behavior, and often such unexpected interactions, being non-synergistic, are undesirable in nature. An example would be multiple business applications
sharing a router within an IT system. As illustrated, a particular application, e.g., an E-mail service, burdens a router in such a way that other applications do not function well. In this example, it is reasonable to expect numerous applications to, at times, share usage of the router. Traditional systems management techniques may have difficulty determining which specific application is causing loss of system performance. This example explains why the need exists to find hidden relationships among IT system components and applications running in such environments. By way of solving the problem in the previous example, it may be necessary to reroute E-mail traffic through another router to obtain adequate performance for the other applications.

Traditional IT system management is now generally defined as including all the tasks that have to be performed to ensure the capability of the IT infrastructure of an organization to meet user requirements. Shown in FIG. 2 is a traditional IT systems management model, generally designated by the reference numeral 200. Essentially, there are groups of system administrators 210 having knowledge of the IT infrastructure, such as the one shown in FIG. 1 hereinafter and generally designated herein by reference numeral 220, which they are managing. Typically, the knowledge of the infrastructure 220 is scattered among the various personnel comprising the system administrator group 210. The total of this knowledge is limited to the sum of the individual administrators' knowledge, where invariably there is a great deal of redundancy of knowledge. This redundancy may be considered an inefficiency of the overall knowledge base. In other words, a theoretical maximum knowledge of the infrastructure 220 would be realized only when each individual administrator of the administration group 210 had knowledge that was unique to that specific administrator. While this may appear to be an ambiguous analysis of the effectiveness of the group, it is of real consequence for the company that must finance a group of administrators. Furthermore, this knowledge is typically not stored in an easily retrievable electronic form.

When system monitoring is included in the aforementioned traditional management system, this monitoring is usually limited to real time data, such as the current system load and the like. An administrator may observe such reporting of real time data, and if system loads or events being monitored are noticed to be consistent with loads that the administrator recognizes to be associated with impending system malfunction or loss of performance, that administrator may redirect part of the load through alternative subsystems of the IT infrastructure.

Often, such real time data reporting may be used in coordination with a system model of the IT system on which data is being collected and reported. The model usually includes a computer algorithm that utilizes code governing the relations among various system devices. A problem with such models, however, is that the relations used in modeling the system account only for expected interactions among components and subsystems. The model is, therefore, merely an idealized model of the actual system. Hidden or unexpected relations that exist between components would not be accounted for. Furthermore, as the infrastructure 220 is modified, the model must be manually altered to include new relations in the model algorithm to account for the changes made.

An improvement over this traditional management system is realized in the so-called expert system. An expert system is a form of artificial intelligence in which a computer program contains a database, frequently referred to as a knowledge base, and a number of algorithms used to extrapolate facts from the programmed knowledge and new data that is input into the system. The knowledge base is a compilation of human expertise used to aid in solving problems, e.g., in medical diagnosis. The utility of the expert system is, however, limited to the quality of the data and algorithms that are input into the system by the human expert.

Typically, expert systems are developed so that knowledge may be accumulated from a person or persons skilled in a specific area of technology and stored in an easily retrievable medium. This way, persons less skilled than the experts whose knowledge was accumulated within the expert system have access to such information. In this manner, a company may save human and financial resources by having less skilled personnel access such expert systems instead of requiring the expert to handle all of the situations requiring a certain level of knowledge.

Utilization of such expert systems allows less skilled persons to also analyze IT systems behavior. These systems may be used to aid in troubleshooting faults in an IT system or they may be used to assist in predicting such faults with the assistance of system performance monitors, i.e., a person with access to an expert system applied to a particular IT system may, through appropriate monitors, study system load parameters or the like and, through the use of the expert system, make estimates of potential faults due to system bottlenecks or the like.

A significant drawback of expert systems, however, is that they are poorly equipped to handle newly encountered problems or situations. In this manner, it is clear that expert systems are limited in their technical capability of resolving novel issues. Instead, expert systems require a complete model of all the events or failures that can occur in the system being modeled.

The present invention is a further progression towards the realization of a fully automated IT management system. In a manner similar to the way in which data mining techniques are applied to predict the behavior of, for instance, the customers in the direct marketing example, the idea of such techniques may be applied to complex IT system models in determining causal relations between IT system components. The system and method of the present invention, when implemented, determine how the interlinked components influence each other in terms of performance, potentially uncovering unexpected relations among different components of an IT system and automatically creating or updating causal association models of such systems. This is accomplished through the use of association rule induction methods in conjunction with other data mining techniques applied on historical data sets of system state data.

It is clear that with today's increasingly interconnected and complex IT infrastructures and the corresponding increases in maintenance costs of such systems, a system and method for discovering causal relationships between various subsystems and elements of such complex networks in a substantially automated manner is certainly a valuable tool.

It is also an object of the present invention to have an automated means of accumulating the assortment of data that may be analyzed by an appropriate data mining technique, such that performance models of complex IT systems based on periodic measurements of predefined performance levels may be generated or updated. Additional description on the collection of monitoring data and application of data mining techniques may be found in Applicants' co-pending patent application, U.S. patent application Ser. No. 09/036,
393, entitled "A System and Method for Generating Performance Models of Complex Information Technology Systems", filed concurrently herewith, which is incorporated herein by reference.

Another desirable feature of an IT system, such as one incorporating the improvements of the present invention, is to reduce the amount of human intervention required for the system to adapt to dynamic system changes. This is preferably accomplished through automation.

It is further desired that the system and method of the present invention analyze system performances with Boolean attributes, i.e., true or false.

SUMMARY OF THE INVENTION

The present invention is directed to a system and method for automatically creating causal association models of complex information technology (IT) systems by use of association rule induction methods, preferably in conjunction with other data mining techniques. System state information is periodically recorded by system monitors placed throughout the IT system. This state information is then stored in a database with the system model.

A model of the IT system environment is developed in terms of system components and relations between them. This model may be defined with any level of detail and does not necessarily have to be complete or consistent.

Thresholds are defined in terms of monitoring events. These thresholds are used to convert the monitored state information from its monitored numeric format to Boolean values. Target components are then selected and an association rules algorithm searches for associations with other components based on the Boolean values obtained from comparisons between monitored state information and associated thresholds. The probability of a causal relation between components is indicated by sets of association rules. Causal relations implied by the model may then be confirmed or refuted. Causal relations discovered that are not implied by the model may indicate the model is incomplete. In this manner, the causal relations of the model may be refined to more accurately model the system environment.

BRIEF DESCRIPTION OF THE DRAWINGS

A more complete understanding of the system and method of the present invention may be had by reference to the following detailed description when taken in conjunction with the accompanying drawings wherein:

FIG. 1 is an exemplary network system upon which the system and method of the present invention may be employed;

FIG. 2 is a block diagram of a traditional IT systems management method;

FIG. 3 is a block diagram of a system and method for adaptive system management in accordance with the present invention; and

FIG. 4 is a schematic diagram illustrating causal modeling between two simple elements within an IT system.

DETAILED DESCRIPTION OF THE PRESENTLY PREFERRED EXEMPLARY EMBODIMENTS

The present invention will now be described more fully hereinafter with reference to the accompanying drawings, in which preferred embodiments of the invention are shown. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.

FIG. 3 shows a model of an adaptive system management scenario in accordance with the system and method of the present invention. The application of data mining for causal relations of system components on an information technology (IT) system, designated by the reference numeral 305, is illustrated in FIG. 3, in which the IT system 100/305 is connected to at least one monitor 310 that monitors the performance of the IT system 305. The monitor 310 is connected to a historical database 315, which is used to store various performance measurements on the IT system 100. The historical database 315, in turn, is connected to a number of learning algorithms 320. These learning algorithms use association rule induction methods well known to those skilled in the data mining community to create causal association models of the IT system. Elements or events relating to the IT system or infrastructure 305 are monitored throughout the system by appropriate monitoring schemes housed within the monitors 310.

Data from the aforementioned monitoring is forwarded by the monitors 310 and stored in the historical database 315. The data within the historical database 315, including the newly updated information on the IT system 305 performance, is then subjected to an association rules algorithm of the learning algorithms 320. The association rules algorithm 320 may confirm or refute associations in the existing IT system model, or it may recover associations which are not considered in the model. The learning algorithms 320 then update an adaptive model of the IT infrastructure, designated by the reference numeral 325, by adding, removing, or altering existing modeled associations.

The management environment stores all collected information and uses various learning techniques to learn about the IT system 305 being managed. It should be understood that the aforementioned 'learning' algorithms are well-known to those skilled in the art. The learning techniques enable the management environment to better adapt itself to [illegible]. New information will then be collected and stored so that the learning process continues. In fact, the entire monitoring, learning, and adapting process provided by the system and method of the present invention is continuous and iterative.

In devising such a dynamic learning model as disclosed in the present invention, it is first necessary to define thresholds for various system performances. These thresholds are required since, when monitoring and recording system state data, numerical data is recovered. This numerical data is ill suited for typical association rules algorithms that are to be utilized in the present invention. By defining thresholds for each specific type of monitored data, the monitored data is converted to a Boolean value by simple comparison between the monitored data and its corresponding threshold.

As an example of such a threshold, consider the exemplary IT system depicted in FIG. 1 on which the present invention may find application. Database 105, also denoted herein by the reference identifier A, is resident on a server 110 such that numerous and diverse users may query the database 105. In querying such a database, it is reasonable that a login must be first performed. This login may conventionally be performed through another database, in this example database 115, also referred to herein as database B.
Therefore, for a system user, such as at a remote computer 140, to remotely query database A, login is executed through database B, which upon a successful login grants a user rights to query database A. For this entire operation, a threshold may be established by knowledgeable management personnel, designated in FIG. 2 by the reference numeral 230. Typically, such a threshold would be formed with knowledge of the server performances on which databases A and B reside and a general knowledge of data traffic through these servers.

In the example depicted in FIG. 1, for instance, such a query may be executed from computer 140 or another computer 145. Therefore, an obvious effect of such a query would be on the performance of a hub element 135. However, in such a network a number of elements may often exist that could unknowingly affect system performance, such as the database query of the ongoing discussion, a network printer 165, another printer 155, a server 160 or any of a wide variety of system hardware resources or software applications running on system 100. A further discussion of potential deleterious effects within a network, such as shown in FIG. 1, may be found in Applicants' aforementioned co-pending patent application.

For the aforementioned database query example, assume the startup time of database A is a reasonable measure of the performance of database A. Therefore, the targeted performance level of database A, or a performance threshold, could be constructed from the access times of databases A and B. Since access time has been assumed to be a good measure of performance of such an application, total access time for database A may include the access time of database B since effective execution of database A is prolonged by the execution of database B. For this case, the total access time, (AT)_AB, for startup of database A may be found from the sum of the startup times of the individual databases, AT_A and AT_B, in other words:

(AT)_AB = AT_A + AT_B

Assume that study of the individual applications and the hardware on which these applications are executed indicates that it is reasonable for execution of database B to take place in no more than 1 second and subsequent execution of database A in no more than 2 seconds. From this information, the target for total startup time of database A, (AT)_AB, would be for execution of A in no longer than 3 seconds. This threshold (TH) would appropriately be recorded as:

TH_AB = 3_seconds

This threshold would indicate, in a Boolean format, that execution of database A in a time of less than or equal to 3 seconds is satisfactory and an execution time exceeding 3 seconds is unsatisfactory, and may be recorded in a form similar to the following:

If_access_time_of_A ≤ TH_AB → A_performance_is_good
Else → A_performance_is_poor

The previous example utilizes a defined threshold, TH_AB. However, some thresholds may not be defined but rather discovered. As an example of such a threshold, consider a monitor that collects (counts) the number of disk accesses to a specific hard drive. Oftentimes, the total number of disk accesses may be of little or no value, i.e., resetting of the monitor counter may render the monitored values useless. However, the average number of disk accesses per unit time may be very valuable. The numeric value of the cumulative disk accesses may be converted to a useful Boolean attribute in a manner similar to the previous example, where the Boolean value would indicate a load above or below the component's average load. For the monitor of the current example, herein denoted monitor_Y, the conversion to a Boolean value would be given according to the following logical expression:

If_Monitor_Y > TH(Average_Monitor_Y) → then (Higher_than_average)_Y = True
Else (Higher_than_average)_Y = False

Here, [illegible] it would not be necessary to define a threshold, as the average value of the monitor, (Average_Monitor_Y), serves as a threshold and is determined by the historical data of loads on component Y. Thus, monitored component state data may be converted to Boolean values by the use of either defined or discovered thresholds. In this way, every monitored component state is able to be described as either in a high or low state. Furthermore, it should be noted that thresholds may be static, i.e., TH_AB, or dynamic, i.e., TH(Average_Monitor_Y).

In order to apply the aforementioned data mining techniques of association rules algorithms to historical data on the IT system 305, it is first necessary to build the aforementioned [illegible] database. Furthermore, each monitor 310 may or may not have its own local memory. Typically, all monitored data would be directed to one central storage location, [illegible] stored in the respective monitor 310 and then later sent [illegible].

Data is collected by monitors 310 placed throughout the IT infrastructure 305 at various components within the system. Monitoring activity may be directed to any number of components with, in general, the overall effectiveness of the present invention enhanced with a corresponding increase in the number of monitors 310 being utilized. These monitors preferably perform their specific monitoring activity automatically and at specific time intervals. The type of data being monitored and stored in the historical database 315 may be generally described as state or usage information on a component level, i.e., a harddisk, database, or a network segment. For instance, a monitor 310 used to monitor and record historical data on a particular harddisk may record free capacity of the disk and whether the disk is being accessed or not. Similar data collected from monitoring a database may include the number of users accessing the database, query volume, and access time.

A final input to be considered is the original system model. The model of the IT system 305 should be developed in terms of system components and relations between these components. A better understanding of development of such a model may be had with reference to FIG. 4. This figure depicts a simplified example relationship between two common IT system components, a Mail Server 400 and File Server 410, and respective clients 420, 430 and 440, attached thereto. The original model is derived from expected interactions among such components. From inspection of FIG. 4, the following relations would likely be expected and would therefore be included in the original system model:

mail_server_down → client_mail_not_readable
File_server_down → File_server_client_response_unacceptable

It should be apparent to those skilled in the art that the exact modeling technique is inconsequential, as any number of algorithmic languages may be used in development of such a model.

Using the model, monitored component and system state 
information, and the Boolean values obtained from the 
thresholds and monitored data as input into the association 
rules induction algorithm, the algorithm searches for rela- 
tions among system state data by techniques well known to 
those in the data mining community. These relations are then 
compared to expected relations as predicted by the current 
system model given identical system state input. In this 
manner, causal relationships in the model can be confirmed 
or refuted, allowing the model to be updated to more 
accurately model the real IT system. 
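
The conversion of monitored numeric state data to the Boolean values that feed the association rules induction algorithm might be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the implementation of the invention: the function names, the sample readings, and the variable names are hypothetical. It shows a static (defined) threshold, such as the 3-second access-time target, and a dynamic (discovered) threshold taken from the historical average of a monitor.

```python
from statistics import mean


def to_boolean_static(values, threshold):
    """True where a reading exceeds a fixed, administrator-defined threshold."""
    return [v > threshold for v in values]


def to_boolean_dynamic(values):
    """True where a reading exceeds the historical average of the same monitor."""
    average = mean(values)
    return [v > average for v in values]


# Hypothetical samples: total access time of database A, in seconds.
access_times = [2.1, 2.8, 3.4, 2.5, 4.0]
# Static threshold of 3 seconds; True marks unsatisfactory performance.
a_performance_is_poor = to_boolean_static(access_times, 3.0)

# Hypothetical samples: disk accesses per minute on component Y.
disk_accesses = [120, 80, 200, 100, 150]
# Dynamic threshold: the average of the recorded history itself.
higher_than_average_y = to_boolean_dynamic(disk_accesses)

print(a_performance_is_poor)
print(higher_than_average_y)
```

In this sketch the dynamic threshold is recomputed from whatever history is passed in, so it shifts as new monitoring data accumulates, whereas the static threshold changes only when an administrator redefines it.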

In the previous example in connection with FIG. 4, for 
instance, the association rules algorithm may discover over 
some time that client mail at Mail Client 420 or another Mail 
Client 430 is not readable whenever the File Server 410 is 
down. This would indicate that the Mail Server 400 and File 
Server 410 are not independent elements. In this case, the 
following rules or their equivalent, would be generated from 
the association rules induction algorithm: 

File_server_down → Mail_server_down

File_server_down → client_mail_not_readable

The model would then be updated, either by an administrator or automatically, to reflect the newly discovered causal relations between the Mail Server 400 and File Server 410 services.
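
The comparison of discovered relations against the relations already in the model can be sketched with plain set operations. The following Python fragment is only an illustration: the rule names are lower-cased for consistency, the mined rule set is assumed to have been produced elsewhere by an association rules algorithm, and treating a modeled rule that is absent from the mined set as refuted is a simplifying assumption.

```python
# Rules are (antecedent, consequent) pairs; all names are illustrative.
model_rules = {
    ("mail_server_down", "client_mail_not_readable"),
    ("file_server_down", "file_server_client_response_unacceptable"),
}

# Assumed output of an association rules algorithm run on historical data.
mined_rules = {
    ("mail_server_down", "client_mail_not_readable"),
    ("file_server_down", "file_server_client_response_unacceptable"),
    ("file_server_down", "mail_server_down"),
    ("file_server_down", "client_mail_not_readable"),
}


def reconcile(model, mined):
    """Split rules into confirmed, unsupported (refuted), and newly discovered."""
    confirmed = model & mined      # modeled and observed in the data
    unsupported = model - mined    # modeled but not observed
    discovered = mined - model     # observed but missing from the model
    return confirmed, unsupported, discovered


confirmed, unsupported, discovered = reconcile(model_rules, mined_rules)

# Updated model: keep confirmed relations and add the discovered ones.
updated_model = confirmed | discovered

for antecedent, consequent in sorted(discovered):
    print(f"new relation: {antecedent} -> {consequent}")
```

Whether the update is applied automatically or presented to an administrator for approval is a policy choice left outside this sketch.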

It should be obvious that the above example is idealized and simplistic. Typically, causal relations among various components would not be so discrete but instead would be determined to have complex probability relations among one another. These types of relations are often expressed with relative 'weights' among each other. In such cases, the model may not be adapted by simply removing or adding causal relations. Often, new information that is discovered may relate to thresholds existing between components. In the previous example, a threshold may be determined such that when the Mail Server 400 is serving X number of clients during a period of time, there is a probability Y of the File Server 410 failing. New information uncovered relating to the causal relation between the Mail Server 400 and File Server 410 of the current example may simply be used to adjust the threshold of the Mail Server 400 load, X, at which performance of the File Server 410 begins to become faulty.
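
One way to picture the adjustment of the load threshold X is to estimate, from historical observations, the probability Y of a File Server failure at or above each candidate load and to pick the lowest load at which Y reaches a chosen level. The Python sketch below is an assumption-laden illustration: the history values, the candidate thresholds, the target probability, and both function names are hypothetical.

```python
def failure_probability(history, load_threshold):
    """Estimate the probability of file server failure at or above a client load.

    history: list of (mail_clients, file_server_failed) observations.
    Returns None when no observation reaches the threshold.
    """
    outcomes = [failed for clients, failed in history if clients >= load_threshold]
    if not outcomes:
        return None
    return sum(outcomes) / len(outcomes)


def lowest_threshold_reaching(history, candidates, target_probability):
    """Return (X, Y) for the smallest candidate X whose estimate Y meets the target."""
    for x in sorted(candidates):
        y = failure_probability(history, x)
        if y is not None and y >= target_probability:
            return x, y
    return None


# Hypothetical observations: (clients served in the period, file server failed?).
history = [
    (10, False), (25, False), (40, True), (55, True),
    (30, False), (60, True), (45, False),
]

result = lowest_threshold_reaching(history, [20, 30, 40, 50], 0.75)
print(result)
```

As more observations are added to the history, rerunning the search may move X up or down, which corresponds to adjusting the threshold rather than adding or removing a relation.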

In general, association rules algorithms work only with 
binary values. Consequently, monitored values must be converted
into binary values. This is accomplished
per the previously discussed thresholds and conversion 
techniques. These Boolean values are then used as descrip- 
tors for system component states to be input into the 
association rules algorithm. The Boolean values are simply 
used to describe the monitored state of components. It is 
these Boolean values that allow the use of such association 
rules algorithms. In general, the association rules algorithm 
verifies or refutes causal association rules of the model, 
where such rules have the following general form: 

Monitor A (antecedent) → Monitor B (consequent)
(confidence x%, support y%)
Here, the antecedent and consequent are simply character- 
izations describing the states of the respective monitors. The 
confidence is the probability that the consequent is true 
given that the antecedent is true. Support is the percentage 
of cases within the total data set for which the rule is found 
to be true. 
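
As a hedged illustration of these two measures, the sketch below computes support and confidence for a single rule over a small set of Boolean state records. The record layout (one dictionary per observation), the function name, and the sample data are assumptions made for the example; values are returned as fractions rather than percentages.

```python
def support_and_confidence(records, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = fraction of all records in which both states are true
    confidence = fraction of records with a true antecedent in which
                 the consequent is also true
    """
    total = len(records)
    with_antecedent = sum(1 for r in records if r[antecedent])
    with_both = sum(1 for r in records if r[antecedent] and r[consequent])
    support = with_both / total if total else 0.0
    confidence = with_both / with_antecedent if with_antecedent else 0.0
    return support, confidence


# Hypothetical Boolean monitor states, one dictionary per observation.
records = [
    {"file_server_down": True, "mail_server_down": True},
    {"file_server_down": True, "mail_server_down": True},
    {"file_server_down": False, "mail_server_down": False},
    {"file_server_down": False, "mail_server_down": True},
    {"file_server_down": False, "mail_server_down": False},
]

support, confidence = support_and_confidence(
    records, "file_server_down", "mail_server_down")
print(f"support={support:.0%}, confidence={confidence:.0%}")
```

In this sample the rule holds in two of five records (support 40%), and the consequent is true every time the antecedent is true (confidence 100%).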

In the example of the Mail Server 400 and File Server 410
depicted in FIG. 4, the input to the association rules
algorithm consisted of the historical system component
state data, thresholds relating to the component state
data, and the system component relations that currently
reside in the system model. These modeled relations
for this specific example are again given below:

mail_server_down → client_mail_not_readable

File_server_down → File_server_client_response_unacceptable

If a system administrator determined that the system
model did not accurately reflect system components' effect
on, for instance, the Mail Server 400 in FIG. 4, the
administrator may select the Mail Server 400 as a target
component to be analyzed by the system and method of the
present invention. The historical database of system component
state data and this data's associated thresholds would then be
analyzed by techniques known to those skilled in the art,
e.g., through analysis of decision nodes of decision trees as
described in more detail in Applicants' copending patent
application. Components with apparent interactions with the
Mail Server 400 would be identified. The components'
thresholds are then used to convert the numerical state data
to Boolean attributes describing such states. The system
component state data, now described with discrete Boolean
attributes, are then input into an association rules algorithm
to search for causal relations on the target component, which
is the Mail Server 400 in this example.
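
The conversion step in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions: each monitored value is compared against a single upper threshold, and the monitor names, threshold values, and the `_over_threshold` suffix are hypothetical. The disclosure does not prescribe this particular encoding.

```python
def to_boolean_states(samples, thresholds):
    """Convert numeric monitor readings into Boolean state attributes.

    Each attribute is True when the reading exceeds its threshold.
    """
    return [
        {f"{name}_over_threshold": sample[name] > limit
         for name, limit in thresholds.items()}
        for sample in samples
    ]


# Hypothetical thresholds and readings for two monitors.
thresholds = {"mail_server_clients": 50, "file_server_response_ms": 200}
samples = [
    {"mail_server_clients": 62, "file_server_response_ms": 350},
    {"mail_server_clients": 12, "file_server_response_ms": 40},
]

states = to_boolean_states(samples, thresholds)
for state in states:
    print(state)
```

The resulting list of Boolean records is the kind of input an association rules algorithm would then search for rules involving the target component.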

The output of the process would be a new set of associations
between the target component and other components.
The output may be manifested in a number of ways. In the
preferred embodiment of the present invention, an initial set
of association rules is generated. These rules are then
compared to the association rules in the model. In this way,
common algorithm coding techniques can render the output
process substantially automatic. The comparison between the
discovered set of association rules and those rules in the
model would have two possible outcomes: association rules
discovered that are coincident with those rules in the model,
and association rules discovered that are not included in
the model. Therefore, causal relations not known or implied
by the model would be identified in a new set of association
rules output by the association rules algorithm, and
association rules discovered that were implied by the model
would be further corroborated by the association rules
algorithm output. In the example of FIG. 4, possible
association rules that may be discovered by the association
rules algorithm include the following:

mail_server_down → client_mail_not_readable

File_server_down → File_server_client_response_unacceptable

File_server_down → Mail_server_down

File_server_down → client_mail_not_readable

The first two association rules output are noted to be
coincident with those association rules of the model.
However, the last two association rules are not included in
the model.
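
The comparison described above amounts to partitioning the discovered rules into those already in the model and those that are new. A minimal sketch, assuming each rule is represented as an (antecedent, consequent) pair and using lowercase forms of the rule names from the FIG. 4 example; the function name is hypothetical:

```python
def compare_rules(discovered, model):
    """Split discovered rules into (coincident with model, new to model)."""
    coincident = discovered & model   # corroborates the existing model
    new = discovered - model          # relations the model did not contain
    return coincident, new


model_rules = {
    ("mail_server_down", "client_mail_not_readable"),
    ("file_server_down", "file_server_client_response_unacceptable"),
}
discovered_rules = model_rules | {
    ("file_server_down", "mail_server_down"),
    ("file_server_down", "client_mail_not_readable"),
}

coincident, new = compare_rules(discovered_rules, model_rules)
for antecedent, consequent in sorted(new):
    print(f"new rule: {antecedent} -> {consequent}")
```

Representing rules as hashable pairs lets ordinary set operations perform the comparison; confidence and support values, if retained, would need to be carried separately.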

The system and method of the present invention may be
realized in an automated fashion by having the newly
discovered association rules automatically added to the
model by means of appropriate coding techniques. However,
such an automated means may not be desirable. In such a
case, the newly discovered association rules that are not
included in the model may be output to a system
administrator to observe or verify before the model is altered.
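
The two update modes described above (fully automatic, or gated by an administrator) can be sketched as a single function with an optional approval callback. The function name, the `automatic` flag, and the `approve` callback are hypothetical names chosen for illustration; the disclosure leaves the mechanism to "appropriate coding techniques".

```python
def apply_new_rules(model, new_rules, automatic=False, approve=None):
    """Return an updated copy of the model's rule set.

    When automatic is True every new rule is added. Otherwise a rule is
    added only if the approve callback (standing in for administrator
    review) returns True for it.
    """
    updated = set(model)
    for rule in sorted(new_rules):
        if automatic or (approve is not None and approve(rule)):
            updated.add(rule)
    return updated


model = {("mail_server_down", "client_mail_not_readable")}
new_rules = {
    ("file_server_down", "mail_server_down"),
    ("file_server_down", "client_mail_not_readable"),
}

# Automatic mode: both new rules are added.
print(len(apply_new_rules(model, new_rules, automatic=True)))

# Review mode: the callback accepts only one of the candidate rules.
reviewed = apply_new_rules(
    model, new_rules, approve=lambda rule: rule[1] == "mail_server_down")
print(sorted(reviewed))
```

Returning a new set rather than mutating the model in place keeps the previous model available if a reviewer later rejects the change.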

The benefits of such a system and method as disclosed in
the present invention are numerous. The development of
association rules between components in a complex IT
system utilizing data mining and association rules induction
methods would greatly reduce the man hours required in
detecting fault causes in such a system. Another obvious
benefit would be the verification of a system model. System
models would be able to be verified for accuracy and
completeness. Unknown interdependencies between system
components may be discovered, thus allowing administrators
to take appropriate actions to increase system performance.
An additional benefit is seen to be that the system model
need not be complete or precise, as over time, incomplete or
erroneous component associations may be accounted for,
thus allowing the model to be adjusted to more accurately
reflect relations among components of the system.

Throughout the discussion of the present invention,
consideration has been given to essentially two system
components and the discovery of causal relations between them. It
should be apparent that the described system was provided
for simplification of discussion and that the present
invention would find implementation on a vast array of system
architectures and complexities, with components numbering
in the hundreds, or even thousands.

As discussed, further description on additional features of
the preferred embodiments of the present invention may be
found in Applicants' co-pending patent application,
incorporated herein by reference.

Although a preferred embodiment of the system and
method of the present invention has been illustrated in the
accompanying drawings and described in the foregoing
detailed description, as well as in Applicants'
aforementioned co-pending patent application, it will be understood
that the invention is not limited to the embodiment
disclosed, but is capable of numerous rearrangements,
modifications and substitutions without departing from the spirit
of the invention as set forth and defined by the following
claims.

What is claimed is:

1. In an information technology system having a
multiplicity of interconnected nodes, a method for modeling the
performance of said system, said method comprising the
steps of:

   (a) in an interactive manner, continuously monitoring, at
   a plurality of said nodes, the performance of said
   system;

   (b) in an automated manner, continuously collecting, at
   said plurality of nodes, performance data for said
   system over a given time period;

   (c) determining, from said performance data of said
   system, a plurality of causal relationships between a
   multiplicity of said interconnected nodes within said
   system;

   (d) comparing said plurality of causal relationships within
   said system with an adaptive model of said system, said
   adaptive system model modeling a portion of said
   information technology system; and

   (e) in an automated manner, updating said adaptive
   system model according to newly discovered causal
   relationships.

2. The method according to claim 1, further comprising
after said step of comparing, the step of:

   (e) modifying said system model.

3. The method according to claim 1, further comprising
the step of:

   repeating steps (a)-(e) a plurality of times, said collecting
   step continuously collecting said performance data
   over other time periods.

4. The method according to claim 1, further comprising
the step of:

   repeating steps (a)-(d) a plurality of times, said collecting
   step continuously collecting said performance data
   over other time periods.

5. The method according to claim 1, wherein if, after said
step of comparing, at least one of said causal relationships
fails to match said adaptive system model, adding said at
least one causal relationship to said adaptive system model.

6. The method according to claim 1, wherein if, after said
step of comparing, said plurality of causal relationships fail
to match said adaptive system model, alerting a system
administrator of said failure.

7. The method according to claim 1, further comprising,
prior to step (a), the steps of:

   running a test program to define a performance threshold
   within said system; and

   determining said plurality of nodes associated with said
   performance threshold.

8. The method according to claim 7, wherein said target
component is selected from the group consisting of a system
hardware resource, a system software application and a
service level agreement.

9. The method according to claim 1, further comprising,
prior to said step of comparing, generating said adaptive
system model.

10. The method according to claim 9, wherein said
adaptive system model is generated from the selection of a
target component within said system to model, a plurality of
said causal relationships determining a plurality of model
nodes within said adaptive system model associated with
said target component.

11. The method according to claim 10, wherein said target
component is selected from the group consisting of a system
hardware resource, a system software application and a
service level agreement.

12. The method according to claim 1, wherein said
performance data in said step of continuously collecting is
automatically collected periodically over said given time
period.

13. The method according to claim 12, wherein said given
time period for continuously collecting said performance
data is selected from the group consisting of days, hours,
minutes and seconds.

14. The method according to claim 1, further comprising,
after said step of continuously collecting and prior to
said step of determining, the step of:

   storing said performance data.

15. The method according to claim 14, wherein said
performance data is stored with an associated time stamp.

16. The method according to claim 15, wherein said
performance data and associated time stamp are stored in a
relational database.

17. The method according to claim 16, wherein said
adaptive system model is stored in said relational database.

18. The method according to claim 1, wherein said
performance data in said step of continuously collecting is
selected from the group consisting of real numbers, integers
and Booleans.

19. The method according to claim 1, wherein said
performance data in said step of continuously collecting is
converted to a plurality of Boolean values.

20. The method according to claim 19, wherein said
plurality of Boolean values correspond to a plurality of
performance threshold conditions.

21. The method according to claim 20, wherein said
performance threshold conditions are predetermined.

22. The method according to claim 20, wherein said
performance threshold conditions are variable.

23. The method according to claim 19, wherein said
performance data is converted to a plurality of Boolean
values by use of predetermined thresholds.

24. The method according to claim 19, wherein said
performance data is converted to a plurality of Boolean
values by use of newly discovered thresholds.

25. The method according to claim 1, wherein said
performance data in said step of continuously collecting is
averaged over said given time period.

26. The method according to claim 25, wherein said
averaged performance data is converted to at least one
Boolean value.

27. The method according to claim 26, wherein said at
least one Boolean value corresponds to at least one service
level agreement within said system.

28. A program storage device readable by a machine and
encoding a program of instructions for executing the method
steps of claim 1.

29. An information technology system having a
multiplicity of interconnected nodes, said system comprising:

   an iterative monitor means for continuously monitoring,
   at a plurality of said nodes, the performance of said
   system at the respective nodes;

   an automated collection means for continuously
   collecting, at said plurality of said nodes, performance
   data for said system over a given time period;

   determining means for determining, from said
   performance data of said system, a plurality of causal
   relationships between a multiplicity of said interconnected
   nodes within said system;

   comparison means for comparing said plurality of causal
   relationships within said system with an adaptive
   model of said system, said adaptive system model
   modeling a portion of said information technology
   system; and

   an automated updating means for updating said adaptive
   system model according to newly discovered causal
   relations after the comparison by said comparison
   means.

30. The system according to claim 29, further comprising:

   modification means for modifying said system model
   after the comparison by said comparison means.

31. The system according to claim 29, wherein said
iterative monitor means continuously monitors system
performance at another plurality of nodes, said automated
collection means continuously collects other performance
data for said system over another time period, said
determining means determines from said other performance data
another plurality of causal relationships between another
multiplicity of said interconnected nodes within said system,
said comparison means compares said another plurality of
causal relationships with said adaptive system model, and
said automated updating means further updating said
adaptive system model.

32. The system according to claim 29, wherein if said
comparison means fails to match at least one of said causal
relationships with said adaptive system model, said
automated updating means adds said at least one causal
relationship to said adaptive system model.

33. The system according to claim 29, further comprising:

   alerting means for alerting a system administrator if said
   plurality of causal relationships fail to match said
   adaptive system model.

34. The system according to claim 29, further comprising:

   testing means for running a test program to define a
   performance threshold within said system, said testing
   means determining said plurality of said nodes
   associated with said performance threshold.

35. The system according to claim 34, wherein said target
component is selected from the group consisting of a system
hardware resource, a system software application and a
service level agreement.

36. The system according to claim 29, further comprising:

   model generation means for generating said adaptive
   system model.

37. The system according to claim 36, wherein said model
generation means selects a target component within said
system to model, a plurality of said causal relationships
determining a plurality of model nodes within said adaptive
system model associated with said target component.

38. The system according to claim 37, wherein said target
component is selected from the group consisting of a system
hardware resource, a system software application and a
service level agreement.

39. The system according to claim 29, wherein said
automated collection means continuously collects said
performance data periodically over said given time period.

40. The system according to claim 39, wherein said given
time period for continuously collecting said performance
data is selected from the group consisting of days, hours,
minutes and seconds.

41. The system according to claim 29, further comprising:

   storage means for storing said performance data
   continuously collected by said automated collection means.

42. The system according to claim 41, wherein said
performance data is stored with an associated time stamp.

43. The system according to claim 42, wherein said
storage is a relational database.

44. The system according to claim 43, wherein said
adaptive system model is stored in said relational database.

45. The system according to claim 29, wherein said
performance data continuously collected by said automated
collection means is selected from the group consisting of
real numbers, integers and Booleans.

46. The system according to claim 29, wherein said
performance data continuously collected by said automated
collection means is converted to a plurality of Boolean
values.

47. The system according to claim 46, wherein said
plurality of Boolean values correspond to a plurality of
performance threshold conditions.

48. The system according to claim 47, wherein said
performance threshold conditions are predetermined.

49. The system according to claim 47, wherein said
performance threshold conditions are variable.

50. The system according to claim 46, wherein said
performance data is converted to a plurality of Boolean
values by use of predetermined thresholds.

51. The system according to claim 46, wherein said
performance data is converted to a plurality of Boolean
values by use of newly discovered thresholds.

52. The system according to claim 29, wherein said
performance data continuously collected by said automated
collection means is averaged over said given time period.

53. The system according to claim 52, wherein said
averaged performance data is converted to at least one
Boolean value.

54. The system according to claim 53, wherein said at
least one Boolean value corresponds to at least one service
level agreement within said system.

55. An article of manufacture comprising a computer
medium having computer readable program code
means embodied thereon for modeling the performance of
an information technology system having a multiplicity of
interconnected nodes, the computer readable program code
means in said article of manufacture comprising:

   computer readable program code means, executed by the
   computer, for performing steps for
      (a) in an iterative manner, continuously monitoring, at
      a plurality of said nodes, the performance of said
      system;

      (b) in an automated manner, continuously collecting, at
      said plurality of nodes, performance data for said
      system over a given time period;

      (c) determining, from said performance data of said
      system, a plurality of causal relationships between a
      multiplicity of said interconnected nodes within said
      system;

      (d) comparing said plurality of causal relationships
      within said system with an adaptive model of said
      system, said adaptive system model modeling a
      portion of said information technology system; and

      (e) in an automated manner, updating said adaptive
      system model according to newly discovered causal
      relationships.

INVENTOR(S) : Pieter Willem Adriaans et al.



It is certified that error appears in the above-identified patent and that said Letters Patent is 
hereby corrected as shown below: 



Column 11, 

Line 40, replace "interactive manner" with -- iterative manner --
Lines 58-60, delete "The method according to 

claim 1, further comprising after 

said step of comparing, the step 

of: 

(e) modifying said system model." 

Column 13, 

Lines 39-41, delete "The system according to claim 
29, further comprising: 
modification means for modifying said 
system model after the comparison by 
said comparison means." 
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JAMES E.ROG AN 
Director of the United States Patent and Trademark Office 
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